Constructing Light Spanners Deterministically in Near-Linear Time
Graph spanners are well-studied and widely used both in theory and practice. In a recent breakthrough, Chechik and Wulff-Nilsen [Shiri Chechik and Christian Wulff-Nilsen, 2018] improved the state-of-the-art for light spanners by constructing a (2k-1)(1+epsilon)-spanner with O(n^(1+1/k)) edges and O_epsilon(n^(1/k)) lightness. Soon after, Filtser and Solomon [Arnold Filtser and Shay Solomon, 2016] showed that the classic greedy spanner construction achieves the same bounds. The major drawback of the greedy spanner is its running time of O(mn^(1+1/k)) (which is faster than [Shiri Chechik and Christian Wulff-Nilsen, 2018]). This makes the construction impractical even for graphs of moderate size. Much faster spanner constructions do exist but they only achieve lightness Omega_epsilon(kn^(1/k)), even when randomization is used.
The contribution of this paper is deterministic spanner constructions that are fast and achieve bounds similar to those of the slower state-of-the-art constructions. Our first result is an O_epsilon(n^(2+1/k+epsilon')) time spanner construction which achieves the state-of-the-art bounds. Our second result is an O_epsilon(m + n log n) time construction of a spanner with (2k-1)(1+epsilon) stretch, O(log k * n^(1+1/k)) edges and O_epsilon(log k * n^(1/k)) lightness. This is an exponential improvement in the dependence on k compared to the previous result with such running time. Finally, for the important special case where k=log n, for every constant epsilon>0, we provide an O(m+n^(1+epsilon)) time construction that produces an O(log n)-spanner with O(n) edges and O(1) lightness, which is asymptotically optimal. This is the first known sub-quadratic construction of such a spanner for any k = omega(1).
To achieve our constructions, we show a novel deterministic incremental approximate distance oracle. Our new oracle is crucial in our construction, as known randomized dynamic oracles require the assumption of a non-adaptive adversary. This is a strong assumption, which has seen recent attention in prolific venues. Our new oracle allows the order of the edge insertions to not be fixed in advance, which is critical as our spanner algorithm chooses which edges to insert based on the answers to distance queries. We believe our new oracle is of independent interest.
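For orientation, the classic greedy spanner mentioned in the abstract can be sketched as follows. This is a minimal illustration of the textbook greedy rule only, not the paper's near-linear-time construction or its distance oracle. The function names `greedy_spanner` and `bounded_dijkstra` and the edge-list format are assumptions made for this example; the stretch parameter `t` plays the role of (2k-1)(1+epsilon).

```python
import heapq

def bounded_dijkstra(adj, source, target, limit):
    """Shortest-path distance from source to target in adj,
    or float('inf') if no path of length <= limit exists."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            # Never explore beyond the distance limit.
            if nd <= limit and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")

def greedy_spanner(n, edges, t):
    """Scan edges by nondecreasing weight; keep edge (u, v, w) only if the
    spanner built so far has no u-v path of length at most t * w."""
    adj = {v: [] for v in range(n)}
    kept = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if bounded_dijkstra(adj, u, v, t * w) > t * w:
            adj[u].append((v, w))
            adj[v].append((u, w))
            kept.append((u, v, w))
    return kept

# Hypothetical usage on a unit-weight triangle with stretch t = 3 (k = 2):
triangle = [(0, 1, 1), (1, 2, 1), (0, 2, 1)]
spanner = greedy_spanner(3, triangle, 3)
```

Each edge triggers one distance query against the partial spanner, and every later decision depends on the earlier answers. That per-edge query is where the greedy algorithm's running time goes.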
Smart City Analytics: Ensemble-Learned Prediction of Citizen Home Care
We present an ensemble learning method that predicts large increases in the
hours of home care received by citizens. The method is supervised, and uses
different ensembles of either linear (logistic regression) or non-linear
(random forests) classifiers. Experiments with data available from 2013 to 2017
for every citizen in Copenhagen receiving home care (27,775 citizens) show that
prediction can achieve state-of-the-art performance, as reported in similar
health-related domains (AUC=0.715). We further find that competitive results
can be obtained by using limited information for training, which is very useful
when full records are not accessible or available. Smart city analytics does
not necessarily require full city records.
To our knowledge, this preliminary study is the first to predict large
increases in home care for smart city analytics.
Near-optimal labeling schemes for nearest common ancestors
We consider NCA labeling schemes: given a rooted tree T, label the nodes of T
with binary strings such that, given the labels of any two nodes, one can
determine, by looking only at the labels, the label of their nearest common
ancestor.
For trees with n nodes we present upper and lower bounds establishing that
labels of size , are both sufficient and
necessary. (All logarithms in this paper are in base 2.)
Alstrup, Bille, and Rauhe (SIDMA'05) showed that ancestor and NCA labeling
schemes have labels of size . Our lower bound
increases this to for NCA labeling schemes. Since
Fraigniaud and Korman (STOC'10) established that labels in ancestor labeling
schemes have size , our new lower bound separates
ancestor and NCA labeling schemes. Our upper bound improves the
upper bound by Alstrup, Gavoille, Kaplan and Rauhe (TOCS'04), and our
theoretical result even outperforms some recent experimental studies by Fischer
(ESA'09) where variants of the same NCA labeling scheme are shown to all have
labels of size approximately
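To make the definition of an NCA labeling scheme concrete, here is a deliberately naive sketch. It is not the scheme from the paper and its labels are far from the bounds discussed. Each node is labeled with the sequence of child indices on its root path, so the label of the nearest common ancestor is the longest common prefix of the two labels. The names `root_path_labels` and `nca_label` and the `children` dictionary format are assumptions for this example.

```python
def root_path_labels(children, root):
    """Label each node with the tuple of child indices along its root path."""
    labels = {root: ()}
    stack = [root]
    while stack:
        u = stack.pop()
        for i, c in enumerate(children.get(u, [])):
            labels[c] = labels[u] + (i,)
            stack.append(c)
    return labels

def nca_label(a, b):
    """Return the NCA's label, computed from the two labels alone:
    the longest common prefix of a and b."""
    k = 0
    while k < min(len(a), len(b)) and a[k] == b[k]:
        k += 1
    return a[:k]

# Hypothetical tree: r has children a and b; a has children c and d.
children = {"r": ["a", "b"], "a": ["c", "d"]}
labels = root_path_labels(children, "r")
```

Here `nca_label(labels["c"], labels["d"])` equals `labels["a"]`, and no access to the tree is needed at query time.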
Simpler, faster and shorter labels for distances in graphs
We consider how to assign labels to any undirected graph with n nodes such
that, given the labels of two nodes and no other information regarding the
graph, it is possible to determine the distance between the two nodes. The
challenge in such a distance labeling scheme is primarily to minimize the
maximum label length and secondarily to minimize the time needed to answer
distance queries (decoding). Previous schemes have offered different trade-offs
between label lengths and query time. This paper presents a simple algorithm
with shorter labels and shorter query time than any previous solution, thereby
improving the state-of-the-art with respect to both label length and query time
in one single algorithm. Our solution addresses several open problems
concerning label length and decoding time and is the first improvement of label
length for more than three decades.
More specifically, we present a distance labeling scheme with label size (log
3)n/2 + o(n) (logarithms are in base 2) and O(1) decoding time. This outperforms
all existing results with respect to both size and decoding time, including
Winkler's (Combinatorica 1983) decades-old result, which uses labels of size
(log 3)n and O(n/log n) decoding time, and Gavoille et al. (SODA'01), which
uses labels of size 11n + o(n) and O(loglog n) decoding time. In addition, our
algorithm is simpler than the previous ones. In the case of integral edge
weights of size at most W, we present almost matching upper and lower bounds
for label sizes. For r-additive approximation schemes, where distances can be
off by an additive constant r, we give both upper and lower bounds. In
particular, we present an upper bound for 1-additive approximation schemes
which, in the unweighted case, has the same size (ignoring second order terms)
as an adjacency scheme: n/2. We also give results for bipartite graphs and for
exact and 1-additive distance oracles.
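As a baseline for what a distance labeling scheme must do, the following sketch stores in each label the node's index and its BFS distances to all nodes. The labels are therefore much longer than the bounds discussed in the abstract, and this is not the paper's algorithm. All function names and the adjacency-dictionary format are assumptions for this example.

```python
from collections import deque

def bfs_distances(adj, source):
    """Unweighted single-source distances by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def naive_distance_labels(adj):
    """Label of v = (index of v, list of distances from v to every node).
    Unreachable nodes are recorded as None."""
    order = sorted(adj)
    index = {v: i for i, v in enumerate(order)}
    labels = {}
    for v in order:
        d = bfs_distances(adj, v)
        labels[v] = (index[v], [d.get(u) for u in order])
    return labels

def decode(label_u, label_v):
    """Distance between u and v using only the two labels."""
    return label_u[1][label_v[0]]

# Hypothetical usage on the path 0 - 1 - 2 - 3:
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
labels = naive_distance_labels(path)
```

Decoding is a single array lookup, but each label stores n distances. Shrinking the label while keeping decoding fast is the trade-off the abstract addresses.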
Sublinear Distance Labeling
A distance labeling scheme labels the nodes of a graph with binary
strings such that, given the labels of any two nodes, one can determine the
distance in the graph between the two nodes by looking only at the labels. A
-preserving distance labeling scheme only returns precise distances between
pairs of nodes that are at distance at least from each other. In this paper
we consider distance labeling schemes for the classical case of unweighted
graphs with both directed and undirected edges.
We present a bit -preserving distance labeling
scheme, improving the previous bound by Bollobás et al. [SIAM J. Discrete
Math. 2005]. We also give an almost matching lower bound of
. With our -preserving distance labeling scheme as a
building block, we additionally achieve the following results:
1. We present the first distance labeling scheme of size for sparse
graphs (and hence bounded degree graphs). This addresses an open problem by
Gavoille et al. [J. Algo. 2004], thereby separating the complexity from
distance labeling in general graphs which require bits, Moon [Proc.
of Glasgow Math. Association 1965].
2. For approximate -additive labeling schemes, that return distances
within an additive error of we show a scheme of size for .
This improves on the current best bound of by
Alstrup et al. [SODA 2016] for sub-polynomial , and is a generalization of
a result by Gawrychowski et al. [arXiv preprint 2015] who showed this for .
Comment: A preliminary version of this paper appeared at ESA'1
Sequence Modelling For Analysing Student Interaction with Educational Systems
The analysis of log data generated by online educational systems is an
important task for improving the systems, and furthering our knowledge of how
students learn. This paper uses previously unseen log data from Edulab, the
largest provider of digital learning for mathematics in Denmark, to analyse the
sessions of its users, where 1.08 million student sessions are extracted from a
subset of their data. We propose to model students as a distribution of
different underlying student behaviours, where the sequence of actions from
each session belongs to an underlying student behaviour. We model student
behaviour as Markov chains, such that a student is modelled as a distribution
of Markov chains, which are estimated using a modified k-means clustering
algorithm. The resulting Markov chains are readily interpretable, and in a
qualitative analysis around 125,000 student sessions are identified as
exhibiting unproductive student behaviour. Based on our results this student
representation is promising, especially for educational systems offering many
different learning usages, and offers an alternative to common approaches like
modelling student behaviour as a single Markov chain, as is often done in the
literature.
Comment: The 10th International Conference on Educational Data Mining 201
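The building block of the approach above, a Markov chain over session actions, can be illustrated by estimating first-order transition probabilities from a set of sessions. This sketch covers only that estimation step under assumed inputs. It does not implement the modified k-means clustering described in the abstract, and the action names are hypothetical.

```python
from collections import defaultdict

def estimate_transitions(sessions):
    """Maximum-likelihood first-order transition probabilities.
    sessions: list of action sequences (lists of hashable action names)."""
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        # Count each consecutive pair of actions within a session.
        for a, b in zip(session, session[1:]):
            counts[a][b] += 1
    probs = {}
    for a, row in counts.items():
        total = sum(row.values())
        probs[a] = {b: c / total for b, c in row.items()}
    return probs

# Hypothetical sessions with made-up action names:
sessions = [
    ["login", "task", "task", "logout"],
    ["login", "task", "logout"],
]
chain = estimate_transitions(sessions)
```

In a clustering setting, one such chain would be estimated per cluster of sessions, and each session assigned to the chain under which its action sequence is most likely.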
Distance labeling schemes for trees
We consider distance labeling schemes for trees: given a tree with n nodes,
label the nodes with binary strings such that, given the labels of any two
nodes, one can determine, by looking only at the labels, the distance in the
tree between the two nodes.
A lower bound by Gavoille et al. (J. Alg. 2004) and an upper bound by Peleg
(J. Graph Theory 2000) establish that labels must use
bits\footnote{Throughout this paper we use for .}. Gavoille et
al. (ESA 2001) show that for very small approximate stretch, labels use
bits. Several other papers investigate various
variants such as, for example, small distances in trees (Alstrup et al.,
SODA'03).
We improve the known upper and lower bounds of exact distance labeling by
showing that bits are needed and that bits are sufficient. We also give ()-stretch labeling
schemes using bits for constant .
()-stretch labeling schemes with polylogarithmic label size have
previously been established for doubling dimension graphs by Talwar (STOC
2004).
In addition, we present matching upper and lower bounds for distance labeling
for caterpillars, showing that labels must have size . For simple paths with nodes and edge weights in , we show that
labels must have size
- …